Abstract Body



Background / Context: 

Attrition occurs when study participants who were assigned to the treatment and control
conditions do not provide outcome data and thus do not contribute to the estimation of the
treatment effects. It is very common in experimental studies in education, as illustrated, for
instance, in a meta-analysis studying “the effects of attrition on baseline comparability in
randomized experiments in education” (Valentine & McHugh, 2007), which found that 119 of 367
randomized education experiments reported student-level attrition. Shadish et al. (1998) called
attrition the Achilles’ heel of randomized experiments. Attrition reduces statistical power by
decreasing sample size. It compromises external validity when those who do not contribute data
are unrepresentative of the original sample and thus degrade the representativeness of those who
remain in the sample (Orr, 1999).

Of greatest concern in experimental studies, however, is the potential for attrition to threaten
internal validity (Campbell & Stanley, 1963; Little & Rubin, 1987; Shadish, Cook, & Campbell,
2001). In the Little & Rubin (1987) missing data framework, if attrition involves cases missing
completely at random (MCAR: does not depend on observed or unobserved variables) or missing
at random (MAR: conditional on observed variables), unbiased treatment effect estimates can be
derived from the obtained data. However, if the attrition is not random under either of these
definitions (is related to at least some extent to unobserved variables), it can bias the treatment
effect estimate. Moreover, that bias can be large even when advanced statistical methods are
used to address the attrition, e.g., multiple imputation (Foster & Fang, 2004; Puma, Olsen, Bell,
& Price, 2009). Unfortunately for the internal validity of experimental studies, it is rarely the
case that the factors that create attrition can be safely assumed to be represented among the
observed variables or to be a virtually random process.

It should also be noted that attrition may bias not only the point estimate of the treatment
effect but also the standard error of that estimate. Hence, it may distort the results of statistical
significance tests and thereby threaten statistical conclusion validity.

We focus in this presentation on the threat attrition poses for internal validity. From that
perspective, the critical questions for experimental studies are how much potential bias different
levels and kinds of attrition might produce in the effect estimates and how much is too much to
allow confidence in those estimates. The What Works Clearinghouse Procedures and Standards
Handbook (version 2.0) set a standard of 0.05 standard deviation or less on the outcome variable
as an acceptable level of bias (U.S. Department of Education, 2008). The overall attrition rates
and differential attrition rates between treatment and control groups that could result in that much
bias or more were estimated through a simulation study.

The team that did the work on attrition for the WWC handbook made a major contribution to
understanding this issue. Remarkably, given how much awareness there is in the experimental
research community of the problems associated with attrition, little systematic research has been
done to provide a framework for appraising attrition and assessing its potential to bias effect
estimates. The most notable gaps in the current literature on this topic include: (1) lack of a
well-developed conceptual and statistical framework that makes the sources of attrition bias and their
influence clear; (2) the potential of well-chosen covariates to reduce attrition bias; (3) the nature
and influence of attrition in cluster randomized controlled trials; (4) attrition bias in the standard
errors of effect estimates; and (5) how to use baseline information most effectively in estimating
the potential attrition bias and what baseline information is most useful for that purpose. We
have explorations of these issues underway. The proposed presentation will report results to date
on the first three issues in this list.

Purpose / Objective / Research Question / Focus of Study: 

The main purpose of the study to be presented is to elaborate a model of the relationships
between attrition and effect estimates and to use that model to guide Monte Carlo simulations
that examine the sources and magnitude of attrition bias under various assumptions for
randomized experiments and cluster randomized experiments.

Significance / Novelty of study: 

This study contributes to methodology and practice for assessing attrition bias in randomized 
experiments and cluster randomized experiments in education. Its results provide guidance for 
identifying the conditions under which attrition bias might present a serious threat to internal 
validity and suggestions for ways to reduce that bias. 

Research Design: 

This is a simulation study that examines attrition bias in completely randomized controlled 
trials (RCT) and cluster randomized controlled trials (CRT). For each of these designs, this 
project involves three components: (1) specifying a model of attrition bias; (2) specifying a 
model for treatment effect estimates that can be linked to attrition bias; and (3) using these 
models to generate data to simulate different attrition situations and their influence on effect 
estimates. 



Completely Randomized Controlled Trials (RCT) 
1. Model of Attrition Bias



We start from the simplest model, i.e., the model without any covariates, which was used in
the WWC Procedures and Standards Handbook. We will then move to a model with a covariate.
Let a random variable, z, represent an individual’s latent propensity to respond. Assume z has
a standard normal distribution, $z \sim N(0, 1)$. The response rate of the sample is
$p = N_{respondent} / N_{total}$. An individual is a respondent if her/his propensity value z exceeds a threshold:

(1) $z > Q(z, 1 - p)$

where Q is the quantile function of the normal distribution, i.e., the inverse of the cumulative
distribution function. Given the response rate p, if z exceeds the value that corresponds to the
percentile (1 - p) of the normal distribution, then that individual responds on the outcome
measure.
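As a minimal sketch of the response rule in Expression 1, the snippet below draws latent propensities from a standard normal distribution and keeps those above the (1 - p) quantile. The function name `respondent_mask`, the sample size, and the seed are illustrative assumptions and are not taken from the study, which used SAS.

```python
import random
from statistics import NormalDist

def respondent_mask(z_values, p):
    """Return True for each z that exceeds the (1 - p) quantile of N(0, 1)."""
    # Threshold z0 = Q(1 - p); a share p of a standard normal lies above it.
    z0 = NormalDist().inv_cdf(1.0 - p)
    return [z > z0 for z in z_values]

rng = random.Random(2011)  # hypothetical seed, used only for reproducibility
z = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
mask = respondent_mask(z, p=0.90)
observed_rate = sum(mask) / len(mask)
print(round(observed_rate, 2))
```

With a large sample the observed share of respondents should sit close to the requested response rate p.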

The outcome at follow-up, y, may be influenced by the propensity to respond, z, so it can be
modeled as follows, with all other influences represented for the moment in the random term, ε:

(2) $y = \beta_1 z + \varepsilon$, with $y \sim N(0, 1)$, $z \sim N(0, 1)$, $\varepsilon \overset{iid}{\sim} N(0, \sigma_\varepsilon^2)$

The correlation of y and z is assumed to be $Corr(y, z) = r_{yz}$, so we will have $\beta_1 = r_{yz}$.
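The link between the slope and the correlation follows from the unit variances. A short derivation, writing $\beta_1$ for the coefficient on z in Model 2 and $r_{yz}$ for Corr(y, z), and assuming (as is usual for this kind of model, though not stated explicitly here) that ε is independent of z:

$$
\begin{aligned}
\operatorname{Cov}(y, z) &= \beta_1 \operatorname{Var}(z) = \beta_1
  \;\Rightarrow\; \beta_1 = \operatorname{Corr}(y, z) = r_{yz}, \\
\operatorname{Var}(y) &= \beta_1^2 + \sigma_\varepsilon^2 = 1
  \;\Rightarrow\; \sigma_\varepsilon^2 = 1 - r_{yz}^2 .
\end{aligned}
$$

Under these assumptions, fixing the correlation therefore fixes both the slope and the residual variance.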

When the treatment variable is included, we allow for the possibility of an interaction
between treatment conditions and the propensity to respond, and the model becomes:

(3) $y = \beta_1 z + \lambda_1 (TREAT) + \lambda_2 (TREAT) z + \varepsilon$

This model assumes that there is a treatment effect and that it may vary conditional on z; $\lambda_1$ is
the average treatment effect when z equals 0.

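Expressed by the treatment status:

For the control group:

(4) $y_c = \beta_1 z_c + \varepsilon$, with $y_c \sim N(0, 1)$, $z_c \sim N(0, 1)$, $\varepsilon \overset{iid}{\sim} N(0, \sigma_\varepsilon^2)$.

The correlation of $y_c$ and $z_c$ is assumed to be $Corr(y_c, z_c) = r_{yz}^c$, so we will have $\beta_1 = r_{yz}^c$.

For the treatment group:

(5) $y_t = \lambda_1 + (\beta_1 + \lambda_2) z_t + \varepsilon$, with $y_t \sim N(\bar{Y}_t, 1)$, where $\bar{Y}_t$ is the mean of $y_t$, $z_t \sim N(0, 1)$*, $\varepsilon \overset{iid}{\sim} N(0, \sigma_\varepsilon^2)$.

The correlation of $y_t$ and $z_t$ is assumed to be $Corr(y_t, z_t) = r_{yz}^t$, so we will have $\beta_1 + \lambda_2 = r_{yz}^t$.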

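Regarding the attrition model with a covariate (e.g., pretest), we use the following model,
where x is the covariate:

(6) $y = \beta_1 x + \beta_2 z + \lambda_1 (TREAT) + \lambda_2 (TREAT) x + \lambda_3 (TREAT) z + \varepsilon$

with $y \sim N(0, 1)$, $x \sim N(0, 1)$, $z \sim N(0, 1)$, $\varepsilon \overset{iid}{\sim} N(0, \sigma_\varepsilon^2)$.

The correlations of y, x, and z are assumed to be:

$Corr(y, x) = r_{yx}$, $Corr(y, z) = r_{yz}$, $Corr(x, z) = r_{xz}$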

2. Model of Impact Estimate 

First, we illustrate the sources of bias using the attrition model without a covariate (Models 1-5).
The treatment effect ($\delta$) is the difference between the treatment and control group means on
the outcome variable, y:

(7) $\delta = \bar{Y}_t - \bar{Y}_c = \lambda_1 + r_{yz}^t \bar{z}_t - r_{yz}^c \bar{z}_c = \lambda_1 + (\beta_1 + \lambda_2) \bar{z}_t - \beta_1 \bar{z}_c$

For the whole sample without attrition, $\bar{z}_t = \bar{z}_c = 0$ and the treatment effect is $\lambda_1$.

Furthermore, according to Expression 1, for any given response rate p, we can find the
threshold value for the propensity to respond, $z_0$ (or vice versa). The respondent sample consists
of the people whose propensity to respond is above this threshold. The distribution of z, the
propensity to respond, for those who do respond is a truncated standard normal distribution
($z \in (z_0, \infty)$).

Bias of the treatment effect estimate is defined as the difference between the treatment effect
estimate derived from the full original sample, if that estimate were known, and that derived
from the sample of those who actually respond:

(8) $Bias = r_{yz}^t \bar{z}_t - r_{yz}^c \bar{z}_c$, where $\bar{z}_t$ and $\bar{z}_c$ are for the respondent sample.

* Note that the actual functions of propensity to respond could be different between the treatment and control groups. However,
after standardization, they will be the same, i.e., the standard normal distribution.

Rearranging Expression 7, we have

(9) $Bias = (r_{yz}^t - r_{yz}^c)\left(\frac{\bar{z}_t + \bar{z}_c}{2}\right) + \left(\frac{r_{yz}^t + r_{yz}^c}{2}\right)(\bar{z}_t - \bar{z}_c)$

From this formulation, we see that bias is the sum of two parts:
$(r_{yz}^t - r_{yz}^c)\left(\frac{\bar{z}_t + \bar{z}_c}{2}\right)$ and $\left(\frac{r_{yz}^t + r_{yz}^c}{2}\right)(\bar{z}_t - \bar{z}_c)$.
The first part concerns the difference of correlations between y and z for the two
groups and the average z of the two groups, which is determined by the average response rate. The
second part concerns the average correlation between y and z for the two groups and the difference
of mean z between the treatment and control groups, which is due to the differential response
rates. In short, both the average and differential correlations between y and z for the treatment
and control groups, and both the average and differential response rates between the treatment
and control groups contribute to the overall bias, and the magnitude of that bias depends on the
bias directions of these two parts.
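To see that the two parts add up to the bias in Expression 8, the products can be expanded directly. In this check, $r_t$ and $r_c$ are shorthand (introduced here only for brevity) for the treatment and control correlations between y and z, and $\bar{z}_t$, $\bar{z}_c$ are the respondent means of z:

$$
(r_t - r_c)\,\frac{\bar{z}_t + \bar{z}_c}{2} + \frac{r_t + r_c}{2}\,(\bar{z}_t - \bar{z}_c)
= \frac{2 r_t \bar{z}_t - 2 r_c \bar{z}_c}{2}
= r_t \bar{z}_t - r_c \bar{z}_c
$$

The cross terms $r_t \bar{z}_c$ and $r_c \bar{z}_t$ cancel. The first part vanishes when $r_t = r_c$, and the second vanishes when the two groups have the same respondent mean of z, which is the case when their response rates are equal.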

The mean of z for the truncated standard normal distribution can be calculated using the
formula below (Barr & Sherrill, 1999):

(10) $E(z) = \frac{1}{\sqrt{2\pi}\,(1 - \Phi(z_0))}\, e^{-z_0^2/2}$, where $\Phi(z_0)$ is the standard normal CDF (cumulative
distribution function) evaluated at $z_0$, which equals the attrition rate, i.e., (1 - p).

There are many simple formulas to calculate $\Phi(z_0)$ from $z_0$ (or vice versa). Based on Shah’s
approximation (1985, p. 80), we can calculate $z_0$ using the formula below:

(11) $z_0 = -2.2 + 0.5\sqrt{40(1 - p) - 0.64}$, $-2.2 < z_0 < 0$ (equivalent to $0.5 < p < 0.98$).
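The approximation in Expression 11 can be compared against an exact quantile. The sketch below is illustrative only: the function names `z0_shah` and `z0_exact` are assumptions made here, and the exact value comes from Python's standard library rather than from anything used in the study.

```python
import math
from statistics import NormalDist

def z0_shah(p):
    """Approximate threshold z0 from the response rate p (Expression 11)."""
    if not 0.5 < p < 0.98:
        raise ValueError("approximation is stated only for 0.5 < p < 0.98")
    return -2.2 + 0.5 * math.sqrt(40.0 * (1.0 - p) - 0.64)

def z0_exact(p):
    """Exact threshold: the (1 - p) quantile of the standard normal."""
    return NormalDist().inv_cdf(1.0 - p)

# Response rates that appear in Tables 1 to 3.
for p in (0.65, 0.70, 0.75, 0.85, 0.90, 0.95):
    print(p, round(z0_shah(p), 3), round(z0_exact(p), 3))
```

For the response rates used in the tables, the two thresholds differ by a few hundredths at most, with the gap widening toward the ends of the stated range.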

Using Expressions 9, 10, and 11, we can easily calculate bias for any given parameters ($p_t$,
$p_c$, $r_{yz}^t$, and $r_{yz}^c$).
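The calculation can be sketched end to end as follows. This is a minimal illustration, not the authors' program: the function names are hypothetical, and the threshold is taken from an exact normal quantile rather than from Expression 11, so small differences from values based on the approximation are possible.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def truncated_mean(p):
    """Mean of z among respondents, E(z | z > z0), for response rate p (Expression 10)."""
    z0 = STD_NORMAL.inv_cdf(1.0 - p)   # exact quantile instead of Expression 11
    return STD_NORMAL.pdf(z0) / p      # phi(z0) / (1 - Phi(z0)), where 1 - Phi(z0) = p

def attrition_bias(p_t, p_c, r_t, r_c):
    """Return (part1, part2, overall) bias following Expression 9."""
    zbar_t, zbar_c = truncated_mean(p_t), truncated_mean(p_c)
    part1 = (r_t - r_c) * (zbar_t + zbar_c) / 2.0   # differential correlation
    part2 = (r_t + r_c) / 2.0 * (zbar_t - zbar_c)   # differential response rate
    return part1, part2, part1 + part2

# Parameters of column (1) in Table 1 with response rates of 0.85 and 0.95.
part1, part2, overall = attrition_bias(p_t=0.85, p_c=0.95, r_t=0.2738, r_c=0.2236)
print(round(part1, 2), round(part2, 2), round(overall, 2))
```

For these inputs the rounded values agree with the Part 1, Part 2, and overall entries reported for that cell of Table 1.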



To estimate the treatment effect and bias for the model with a covariate, we used simulation
based on the model below:

(12) $y = \beta_0^* + \beta_1^* x + \beta_2^* (TREAT) + \beta_3^* (TREAT) x + \varepsilon^*$

$\beta_2^*$ is the average treatment effect when x equals 0. The average treatment effect at the
average x is ($\beta_2^* + \beta_3^* \bar{x}$), where $\bar{x}$ is the mean of x. Model 12 can produce an unbiased impact
estimate for the complete sample because z is unrelated to the treatment status in the complete
sample. However, Model 12 may produce a biased impact estimate for the respondent sample
because z may be related to the treatment status in the respondent sample. If we set the true
treatment effect in Model 6, $\lambda_1$, as 0, then $\beta_2^*$ and ($\beta_2^* + \beta_3^* \bar{x}$) estimated from the
respondent sample are the overall biases of the treatment effect in raw score at x of 0 and at the
average x, respectively.

3. Simulation Procedure 

We used a macro program written in SAS 9.2 to generate and analyze data for the model with a
covariate. Based on the model of attrition bias, the parameters that we can manipulate include:
response rate for the control group ($p_c$), response rate for the treatment group ($p_t$), $r_{yx}^t$, $r_{yx}^c$,
$r_{yz}^t$, $r_{yz}^c$, $r_{xz}^t$, $r_{xz}^c$, and $\lambda_1$. To simplify, we set $\lambda_1$ as 0. Given these specific parameters, we first
calculated $\beta_1$, $\beta_2$, $\lambda_2$, $\lambda_3$, and $\sigma_\varepsilon^2$. To create one master dataset containing the whole
sample, which consists of 5,000 observations, we applied a Cholesky decomposition to create
correlated random variables, x and z, for any given $r_{xz}^t$ (or $r_{xz}^c$). We created 2,500 observations
for the treatment group and another 2,500 observations for the control group. The outcome
variable, y, was then generated based on Model 6. The respondent sample includes those who
satisfy Model 1. For each parameter combination, we generated 1,000 master datasets. These
1,000 datasets were analyzed using Model 12, and the estimates of bias and MSE (mean square error)
were calculated.
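The procedure can be sketched in Python as below. This is a rough illustration under stated assumptions and not a port of the SAS macro: all function names are hypothetical, each arm is generated separately so that y has unit variance within the arm, the true treatment effect is 0, only 30 replications are run to keep it quick (the study used 1,000), and the interacted regression is fitted as two separate simple regressions, which gives the same treatment coefficient at x = 0.

```python
import math
import random
from statistics import NormalDist, fmean

def slopes(r_yx, r_yz, r_xz):
    """Coefficients of y on standardized x and z, plus residual SD, implied by the correlations."""
    b_x = (r_yx - r_xz * r_yz) / (1.0 - r_xz ** 2)
    b_z = (r_yz - r_xz * r_yx) / (1.0 - r_xz ** 2)
    resid_var = 1.0 - (b_x * r_yx + b_z * r_yz)   # keeps Var(y) = 1
    return b_x, b_z, math.sqrt(resid_var)

def simulate_group(n, p, r_yx, r_yz, r_xz, rng):
    """Generate one arm and return (x, y) pairs for respondents only."""
    b_x, b_z, sd = slopes(r_yx, r_yz, r_xz)
    z0 = NormalDist().inv_cdf(1.0 - p)            # response threshold
    chol = math.sqrt(1.0 - r_xz ** 2)             # entry of the 2x2 Cholesky factor
    kept = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = r_xz * x + chol * rng.gauss(0.0, 1.0)  # Corr(x, z) = r_xz
        if z > z0:
            y = b_x * x + b_z * z + rng.gauss(0.0, sd)  # true treatment effect is 0
            kept.append((x, y))
    return kept

def ols_line(pairs):
    """Intercept and slope of a simple regression of y on x."""
    xs, ys = zip(*pairs)
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    slope = sxy / sxx
    return my - slope * mx, slope

def mean_bias(p_t, p_c, r_yz_t, r_yz_c, r_yx=0.7, r_xz=0.3, n=2500, reps=30, seed=1):
    """Average estimated treatment coefficient at x = 0 across replications."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        a_t, _ = ols_line(simulate_group(n, p_t, r_yx, r_yz_t, r_xz, rng))
        a_c, _ = ols_line(simulate_group(n, p_c, r_yx, r_yz_c, r_xz, rng))
        estimates.append(a_t - a_c)   # difference in intercepts = TREAT coefficient
    return fmean(estimates)

bias = mean_bias(p_t=0.85, p_c=0.95, r_yz_t=0.7071, r_yz_c=0.4472)
print(round(bias, 2))
```

Because the true effect is set to 0, any nonzero average estimate is attrition bias. With so few replications the result is noisy, so it should be read as broadly comparable to the reported simulations rather than as a reproduction of them.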

Cluster Randomized Controlled Trials (CRT) 

Attrition at level 1, level 2, and both levels will be examined using similar models (but taking 
the hierarchical structure into account) and simulations. These results will also be reported in the 
proposed presentation. 

Findings / Results: 

Completely Randomized Controlled Trials (RCT) 

Tables 1 and 2 present the findings for bias calculated using Expressions 9, 10, and 11 for the
model without a covariate. The parameters used in Table 1 are a subset of the parameters used in
Table A1 in the WWC Procedures and Standards Handbook. The overall attrition bias is
identical to that reported there. Note that, as expected, when the response/attrition rate is the same
for treatment and control, the Part 2 bias is 0. The overall attrition bias then comes entirely from Part 1
bias, i.e., bias due to the differential correlations between y and z for the treatment and control groups.

Table 2 presents the results of bias for $r_{yz}^t = r_{yz}^c$. The correlations are the average of
$r_{yz}^t$ and $r_{yz}^c$ in Table 1. We can see that Part 1 bias is always 0 because there are no differential
correlations between y and z for the treatment and control groups. In addition, when the response
rate is the same, there is no attrition bias.

Table 3 presents the results of bias for the model with a covariate, x, assumed to be a pretest
with $r_{yx}^t = r_{yx}^c = 0.7$ and $r_{xz}^t = r_{xz}^c = 0.3$. Comparing with Table 1, we find that bias was reduced
when $p_t \neq p_c$, which means that including a covariate in the model can reduce Part 2 bias.

Conclusions: 

This study modeled the sources of attrition bias under various assumptions for completely 
randomized controlled trials (RCT) and (to be provided by the time of the SREE meeting) cluster 
randomized controlled trials (CRT). The overall bias is associated with both the overall attrition 
rate and the differential attrition rate, and both the overall and differential correlations between y 
and z for the treatment and control groups. In addition, these results show that bias can be 
reduced by including baseline covariates in the impact estimate model if those covariates are 
correlated with both the latent propensity to respond and the outcome variable. 
Appendix B, Tables and Figures

Table 1. Attrition Bias for ryz(t) ≠ ryz(c) (without covariate)

Pt     Pc     Parameter                (1)      (2)      (3)      (4)      (5)      (6)
              R²yz(t)                  0.075    0.10     0.15     0.20     0.30     0.50
              R²yz(c)                  0.05     0.05     0.05     0.15     0.20     0.20
              ryz(t)                   0.2738   0.3162   0.3873   0.4472   0.5477   0.7071
              ryz(c)                   0.2236   0.2236   0.2236   0.3873   0.4472   0.4472
0.90   0.90   Overall attrition bias   0.01     0.02     0.03     0.01     0.02     0.05
              Part 1 bias              0.01     0.02     0.03     0.01     0.02     0.05
              Part 2 bias              0.00     0.00     0.00     0.00     0.00     0.00
0.85   0.95   Overall attrition bias   0.05     0.06     0.08     0.08     0.10     0.14
              Part 1 bias              0.01     0.02     0.03     0.01     0.02     0.05
              Part 2 bias              0.04     0.04     0.05     0.07     0.08     0.09
0.70   0.70   Overall attrition bias   0.03     0.05     0.08     0.03     0.05     0.13
              Part 1 bias              0.03     0.05     0.08     0.03     0.05     0.13
              Part 2 bias              0.00     0.00     0.00     0.00     0.00     0.00
0.65   0.75   Overall attrition bias   0.06     0.09     0.13     0.09     0.12     0.21
              Part 1 bias              0.03     0.05     0.08     0.03     0.05     0.13
              Part 2 bias              0.04     0.04     0.05     0.06     0.07     0.09

Note. Entries are biases calculated using Expressions 9, 10, and 11. Part 1 and Part 2 biases are the two
terms in Expression 9. Pt and Pc are the response rates for the treatment and control group, respectively.
The overall attrition bias may not equal the sum of the Part 1 and Part 2 biases because of
rounding.
Table 2. Attrition Bias for ryz(t) = ryz(c) (without covariate)

Pt     Pc     Parameter                (1)      (2)      (3)      (4)      (5)      (6)
              R²yz(t) = R²yz(c)        0.06     0.07     0.09     0.17     0.25     0.33
              ryz(t) = ryz(c)          0.2487   0.2699   0.3055   0.4173   0.4975   0.5772
0.90   0.90   Overall attrition bias   0.00     0.00     0.00     0.00     0.00     0.00
              Part 1 bias              0.00     0.00     0.00     0.00     0.00     0.00
              Part 2 bias              0.00     0.00     0.00     0.00     0.00     0.00
0.85   0.95   Overall attrition bias   0.04     0.04     0.05     0.07     0.08     0.09
              Part 1 bias              0.00     0.00     0.00     0.00     0.00     0.00
              Part 2 bias              0.04     0.04     0.05     0.07     0.08     0.09
0.70   0.70   Overall attrition bias   0.00     0.00     0.00     0.00     0.00     0.00
              Part 1 bias              0.00     0.00     0.00     0.00     0.00     0.00
              Part 2 bias              0.00     0.00     0.00     0.00     0.00     0.00
0.65   0.75   Overall attrition bias   0.04     0.04     0.05     0.06     0.07     0.09
              Part 1 bias              0.00     0.00     0.00     0.00     0.00     0.00
              Part 2 bias              0.04     0.04     0.05     0.06     0.07     0.09

Note. Entries are biases calculated using Expressions 9, 10, and 11. Part 1 and Part 2 biases are the two
terms in Expression 9. Pt and Pc are the response rates for the treatment and control group, respectively.
The overall attrition bias may not equal the sum of the Part 1 and Part 2 biases because of
rounding.
Table 3. Attrition Bias with Covariate, x (ryx(t) = ryx(c) = 0.7 and rxz(t) = rxz(c) = 0.3)

Pt     Pc     Parameter                (1)      (2)      (3)      (4)      (5)      (6)
              ryz(t)                   0.2738   0.3162   0.3873   0.4472   0.5477   0.7071
              ryz(c)                   0.2236   0.2236   0.2236   0.3873   0.4472   0.4472
0.90   0.90   Overall attrition bias   0.01     0.02     0.03     0.01     0.02     0.05
0.85   0.95   Overall attrition bias   0.02     0.03     0.05     0.05     0.07     0.11
0.70   0.70   Overall attrition bias   0.02     0.05     0.08     0.03     0.05     0.13
0.65   0.75   Overall attrition bias   0.03     0.06     0.10     0.06     0.09     0.19

Note. N= 5,000. 1,000 replications. Pt and Pc are the response rates for the treatment and control group, 
respectively. ES is in terms of pooled standard deviation for the whole sample. The overall attrition bias 
may not equal the sum of the selection bias and omitted variable because of rounding. 